-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make CheckUrlExists work on Node18 #157
Conversation
This reverts commit 27ebad0.
Yes, groupId changed. Old version didn't work with Java 17
5024f97
to
482bb15
Compare
if (parsedUrl.port) { | ||
options = { port: parsedUrl.port, ...options }; | ||
} | ||
await ftpClient.access(options); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this or any other await
fails, it will throw an exception, which will be caught on line 46/36 (note to self and anybody else who isn't familiar with JS promises and await
.
} | ||
await ftpClient.access(options); | ||
const size = await ftpClient.size(parsedUrl.path); | ||
return size > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it plausible that a valid file will be empty? If so, >=
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I was thinking an empty file would never be valid from a workflow input perspective. But I could see it both ways. I'm going to leave it as is for now, but if you or anybody feels strongly, I can change it.
Thought we did s3 and gcp, but it looks like at least s3 is converted to https? https://docs.dockstore.org/en/stable/faq.html?highlight=open%20data#what-is-an-open-data-tool-or-workflow-version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check on the story with s3 urls
@@ -7,10 +7,10 @@ | |||
"author": "Dockstore", | |||
"license": "MIT", | |||
"dependencies": { | |||
"node-libcurl": "^2.3.4" | |||
"basic-ftp": "^5.0.3", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, basic-ftp has more dependents and more downloads, so this is probably a good move in terms of maintainability
https://www.npmjs.com/package/basic-ftp?activeTab=readme
https://www.npmjs.com/package/node-libcurl
@@ -35,6 +35,10 @@ describe("Tests index", function () { | |||
"ftp://ftp.this.is.fake.or.private.1000genomes.ebi.ac.uk/vol1/ftp/CHANGELOG"; | |||
await setupTest(url, false); | |||
}); | |||
it("handles an unknown protocol", async () => { | |||
const url = "s3://bucket/objectkey"; // We don't natively support s3 URIs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on the other hand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on the other hand
node-libcurl doesn't support s3 either. We've been getting around this in the web service by converting s3 and gs urls to https urls before invoking the lambda.
Per node-libcurl npm home page, we are losing support for the DICT, FILE, Gopher, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS, Telnet and TFTP protocols, but I don't think any workflows have test parameter files with those protocols
Saw this comment after your inline comment... Yes we convert s3 and gs URIs to https URLs in the web service. In hindsight, maybe we should have done the conversion here. |
Description
Instead of using node-libcurl, use node-ftp and follow-redirects (which uses http and https) to check if urls exist.
Why it broke
The fix
One option was to use a lambda layer to provide the native dependencies. It's a bit tricky, and I created a support case with AWS to figure it out. See JIRA issue for details. I ultimately decided against that approach because:
So, I just got rid of node-libcurl, and I'm using other node libraries. The cons:
Issue
SEAB-5759
Security
If there are any concerns that require extra attention from the security team, highlight them here.
Please make sure that you've checked the following before submitting your pull request. Thanks!